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Abstract: Simultaneous Localization and Mapping (SLAM) is perhaps the most 
fundamental problem to solve in robotics in order to build truly autonomous mobile robots. 
The sensors have a large impact on the algorithm used for SLAM. Early SLAM 
approaches focused on the use of range sensors as sonar rings or lasers. However, cameras 
have become more and more used, because they yield a lot of information and are well 
adapted for embedded systems: they are light, cheap and power saving. Unlike range 
sensors which provide range and angular information, a camera is a projective sensor 
which measures the bearing of images features. Therefore depth information (range) 
cannot be obtained in a single step. This fact has propitiated the emergence of a new family 
of SLAM algorithms: the Bearing-Only SLAM methods, which mainly rely in especial 
techniques for features system-initialization in order to enable the use of bearing sensors 
(as cameras) in SLAM systems. In this work a novel and robust method, called Concurrent 
Initialization, is presented which is inspired by having the complementary advantages of 
the Undelayed and Delayed methods that represent the most common approaches for 
addressing the problem. The key is to use concurrently two kinds of feature representations 
for both undelayed and delayed stages of the estimation. The simulations results show that 
the proposed method surpasses the performance of previous schemes. 

Keywords: bearing-sensor; SLAM; robotics 



Sensors 2010, 10 



1512 



1. Introduction 

Simultaneous Localization and Mapping (SLAM) is perhaps the most fundamental problem to solve 
in robotics in order to build truly autonomous mobile robots. SLAM treats of the way how a mobile 
robot can operate in an a priori unknown environment using only onboard sensors to simultaneously 
building a map of its surroundings which uses to track its position. 

The robot's sensors have a large impact on the algorithm used for SLAM. Early SLAM approaches 
focused on the use of range sensors as sonar rings or lasers e.g., [1]. Nevertheless there are some 
disadvantages with the use of range sensors in SLAM: correspondence or data association is difficult; 
they are expensive and some of them are limited to 2D maps and computational overhead due to large 
number of features (see [2,3] for a complete review). 

The aforementioned issues have propitiated that recent work moves towards the use of cameras as 
the primary sensing modality. Cameras have become more and more interesting for the robotic 
research community, because they yield a lot of information for data association, although this 
problem remains latent. Cameras are well adapted for embedded systems: they are light, cheap and 
power saving. Using vision, a robot can localize itself using common objects as landmarks. 

On the other hand, while range sensors {i.e., laser) provide range and angular information, a camera 
is a projective sensor which measures the bearing of images features. Therefore depth information 
(range) cannot be obtained in a single frame. This fact has propitiated the emergence of a new family 
of SLAM methods: The Bearing-Only SLAM methods, which mainly rely in especial techniques for 
features system-initialization in order to enable the use of bearing sensors (as cameras) in 
SLAM systems. 

In this context, a camera connected to a computer becomes a position sensor which could be applied 
to different fields such as robotics (motion estimation for generally moving robots humanoids), 
wearable robotics (motion estimation for camera equipped devices worn by humans), tele-presence 
(head motion estimation using an outward-looking camera), or television (camera motion estimation 
for live augmented reality) [4]. 

Usually the Bearing-Only SLAM has been associated with vision-based SLAM systems, possibly 
because cameras are by far the most popular bearing sensor used in robotics. In that sense, the use of 
alternative bearing sensors {i.e., auditory sensing) for performing SLAM has been much less explored. 
Nevertheless in an authors' previous work [5], a Sound-Based SLAM system is proposed where sound 
sources are used as map features and thus showing the viability on the inclusion of the hearing sense in 
SLAM and the use of alternative bearing sensors. 

In recent years several important improvements and variants to this kind of methods have 
appeared [6,7]. Also different schemes for increasing the number of features managed into the map 
have appeared [8]. Nevertheless the initialization process of new features is still the most important 
problem for addressing in Bearing-Only SLAM in order to improve the robustness. 

In this work a novel and robust method called Concurrent Initialization is presented which is 
inspired by having the complementary advantages of the Undelayed and Delayed methods, which 
represent the most common approaches for addressing the problem of initializing new features in 
bearing-only SLAM. 

2. Related Work 
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Bearing-Only SLAM has received most attention in the current decade. Therefore many of the 
related approaches are actually very recent. In [9] Deans proposes a combination of a global 
optimization BA, (Bundle Adjustment) for feature initialization and Kalman Filter for state estimation. 
In this method, due to the limitation on the baseline on which features can be initialized and depending 
on the camera motion and the landmark location, some features cannot be initialized. Strelow proposes 
in [10] a similar method but mixing camera and inertial sensors measurements in an Iterated Extended 
Kalman Filter (IEKF). In [11] Bailey proposes a variant of constrained initialization for bearing-only 
SLAM, where past vehicle pose estimates are retained in the SLAM state so that feature initialization 
can be deferred until their estimates become well-conditioned. In that sense the past poses of the robot 
are stacked in the map, together with associated measures, until base-line is sufficient to permit a 
Gaussian initialization. The criteria used for determining whether the estimation is well-conditioned 
(Gaussian) is the Kullback distance. The complexity of the proposed sampling method to evaluate this 
distance is very high. 

Also there are some works that use other estimation techniques (apart to the EKF) in Bearing-Only 
SLAM like the Particle Filters (PF) based methods. In a series of papers, Kwok uses Particle Filters: 
in [12] variations to standard PF are proposed to remedy the sample impoverishment problem in 
bearing-only SLAM. In [13] initial state of features is approximated using a sum of Gaussians, which 
defines a set of hypothesis for the position of the landmark, and includes all the features inside the map 
from the beginning. On successive observations, sequential radio test (SRT) based on likelihoods is 
used to prune bad hypothesis. The way these hypotheses are initialized is not detailed, and 
convergence and consistency issues are not discussed. In [14] Kwok extends the algorithm using a 
Gaussian Sum Filter, with an approach similar to the proposed in [15] for bearing-only tracking. This 
method is perhaps the first undelayed feature initialization method. The main drawback of this 
approach is that the number of required filters can grow exponentially, and therefore computational 
load grows exponentially with the number of landmarks. 

Some of the most notably advances on Bearing-Only SLAM have been presented by 
Davison [4,16], who shows the feasibility of real-time SLAM with a single camera, using the 
well-established EKF estimation framework. In this work a Bayesian partial-initialization scheme for 
incorporating new landmarks are proposed where a separate Particle Filter is used for estimating the 
feature depth which is not correlated with the rest of the map. In that sense it maintains a set of depth 
hypotheses uniformly distributed along the viewing ray of a new landmark, with a particle filter in one 
dimension. Each new observation is used to update the distribution of possible depths, until the 
variance range is small enough to consider a Gaussian estimation, in this point the estimation is added 
to the map as a three-dimensional entity. Until this initialization occurs, the ray estimation is 
maintained in the system's single Extended Kalman Filter. A drawback of this approach is that the 
initial distribution of particles has to cover all possible depth values for a landmark, this fact makes it 
difficult to use when the number of detected features is large or when there are far features in the 
scene. As a result, its application in large environments is not straightforward, as it would require a 
huge number of particles. 

Jensfelt in [17] presents a method where the idea is to let the SLAM estimation lag behind N frames 
and using these N frames to determine which points are good landmarks and find an estimate of 
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their 3D location. Mainly the focus in this work is on the management of the features to achieve 
real-time performance in extraction, matching and loop detection. 

In [18] Sola presents a method based on Federate Kalman Filtering technique. With initial 
Probability Distribution Function (PDF) for the features, a geometric sum of Gaussians is defined. The 
method is an approximation of the Gaussian Sum Filter (GSF), which was used in [14], that permits 
undelayed initialization with simply an additive growth of the problem size. A drawback of this 
approach is that it does not cope with features at very large depths. In [19] Sola presents a similar 
method to the presented one in [18] but features are initialized with a delayed method. 

In [20] a FastSLAM [21] based approach is proposed by Eade. Here the pose of the robot is 
represented by particles and a set of Kalman Filters refines the estimation of the features. When the 
inverse depth collapses, the feature is converted to a fully initialized standard Euclidean 
representation. This approach for features initialization seems appropriate within a FastSLAM 
implementation, but it lacks for a more general framework. 



Table 1. Summary of methods. 


Approach 


Delayed / Undelayed 


Initial representation 


Estimation 


Deans [9] 


Delayed 


Bundle adjustment 


EKF 


Strelow [10] 


Delayed 


Triangulation 


IEKF 


Bailey [11] 


Delayed 


Triangulation 


EKF 


Davison [16] 


Delayed 


Multi-Hypotheses 


EKF 


Kwok (a) [13] 


Delayed 


Multi-Hypotheses 


PF 


Kwok (b) [14] 


Undelayed 


Multi-Hypotheses 


PF 


Sola [18] 


Undelayed 


Multi-Hypotheses 


EKF 


Lemaire [19] 


Delayed 


Multi-Hypotheses 


EKF 


Jensfelt [17] 


Delayed 


Triangulation 


EKF 


Eade [20] 


Delayed 


Single Hypotheses 


FastSLAM 


Montiel [22] 


Undelayed 


Single Hypotheses 


EKF 


Munguia[23] 


Delayed 


Triang.- S. Hypotheses 


EKF 


This work 


Delayed-Undelayed 


KF-Triang.- S. 


EKF 



Montiel et al. in [22] presented a method, where the transition from partially to fully initialized 
features does not need to be explicitly tackled, making it suitable for direct use in EKF framework for 
sparse mapping. In this approach the features are initialized in the first frame observed with an initial 
fixed inverse depth and uncertainty, determined heuristically to cover range from nearby to infinity, 
therefore distant points can be coded. Due to the clarity and scalability, this approach is a good option 
to be implemented. 

On the other hand, experiments show that initial fixed parameters can affect the robustness of the 
method, especially when an initial metric reference is used in order to recover/set the scale of the map. 
This fact motivated the authors to develop in [23] a delayed version of the above method. In this case, 
initial depth and uncertainty of each feature are dynamically estimated prior to add this new landmark 
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in the stochastic map. The works [22] and [23] are analyzed in the next sections. Table 1 shows a 
summary of the above methods. 

3. Problem Statement 

3.1. Sensor motion model 

Let us consider a bearing sensor, with a limited field of view, moving freely in 2DOF. The sensor 
state x v is defined by: 

K = [x v ,y v A> v x> v y> v o]' (1) 

where [x v ,y»9 v ] represents the center position and orientation of the sensor and [v x ,v y ,v e ] denoting 
linear and angular velocity. 

At every step it is assumed an unknown linear and angular acceleration with zero mean and known 
covariance Gaussian processes, a w and a w , producing an impulse of linear and angular velocity: 





-yW~ 

X 
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a w At 
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(2) 



The sensor motion prediction model is: 



f v = 



(3) 



V d(k) + V 9 

An Extended Kalman Filter propagates the sensor pose and velocity estimates, as well as 
feature estimates. 



X v(k+\) 




^v(i+l) 




U v(k+\) 




V x(ir+1) 




V y(k+l) 




V 9(k+\) 





3.2. Features definition and measurement model 

The complete state x that includes the features y is made of: 

/\ r /v /v /v /y. ~\T 

x = L x v,yi,y2v-y n J 

where a feature y represents a feature i defined by the 4-dimension state vector: 

9i =[*i>yiA>PiY 

which models a 2-D point located at: 



(4) 



(5) 
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where xuyt is the sensor center coordinates when the feature was first observed; and Q\ represents the 
azimuth (respect to the world reference W) for the directional unitary vector m(8i). The point depth 
along the ray is coded by its inverse p, = l/di (Figure 1). 



The use of an inverse depth parameterization for bearing-only SLAM can improve the linearity of 
the measurement equation even for small changes in the sensor position (corresponding to small 
changes in the parallax angle), this fact allows a Gaussian distribution to cover uncertainty in depth 
which spans a depth range from nearby to infinity. It is well known the relevance of a good uncertainty 
Gaussian representation in a scheme based in EKF [24]. 

Figure 2 shows a simulation for a point reconstruction from noisy bearing measurements at different 
parallax (upper plot), using both, the Euclidean and the Inverse Depth parameterization. The location 
of the vehicle is known. A Gaussian error ao= 1° (degrees) is introduced in bearings. Lower plots 
show the evolution of the likelihood for depth and inverse depth as the parallax in the observation 
grows: in (a), the estimates of depth likelihood converge to a Gaussian-like shape, but the initial 
estimates are highly non-Gaussian, with heavy tails. In contrast, likelihoods of inverse depth (b) 
(abscissa in inverse meters) are nearly Gaussian, even for low parallax. Therefore it can be clearly 
appreciated how the uncertainty can be represented by a Gaussian using the inverse depth 
parameterization over whole parallax range, whereas the Euclidean representation converges to a 
Gaussian-like shape only to the final estimates. For the Euclidean representation, the parallax needed 
for the likelihood converges to a Gaussian-like shape, depending on the sensor noise, and thus the 
noisier the sensor the more parallax is needed for convergence. 

The different locations of the sensor, along with the location of the already mapped features, are 
used to predict the feature angle h, (angle describing the direction of the feature in the sensor 
coordinate frame). The measurement model is defined by: 



Figure 1. Inverse depth parameterization. 
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h, = atan2 



f 1 1 

— sin 6> + y . - j v , — cos 0. + x. - x, 

\P> Pi 



(7) 



atan2 is a two-argument function that computes the arctangent of ylx given y and x, within a range 
of [-71, ji]. At this stage it is assumed that the bearing sensor is capable of tracking and discriminating 
between the landmarks, in other words, the data association problem is obviated. 

Figure 2. Simulation of a point reconstruction from observations with different parallax. 
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In implementation using real data, features search could be constrained to regions around the 
predicted /z,. These regions are defined by the innovation covariance matrix Si = HP^+iH' + R where 
Hi is the Jacobian of the sensor model with respect to the state, Pk+i is the prior state covariance, and 
measurements z are assumed corrupted by zero mean Gaussian noise with covariance R. 

As it was stated before, depth information cannot be obtained in a single measurement when 
bearing sensors are used. To infer the depth of a feature, the sensor must observe it repeatedly as the 
sensor freely moves through its environment, estimating the angle from the feature to its center. The 
difference between angle measurements is the feature parallax. Actually, parallax is the key that allows 
to estimating features depth. In the case of indoor sequences, centimeters are enough to produce 
parallax, on the other hand, the more distant the feature, the more the sensor has to travel to produce 
parallax. Therefore, in order to incorporate new features to the map, special techniques for features 
system- initialization are needed in order to enable the use of bearing sensors in SLAM systems. 

Let us consider two methods, which represent the main approaches (undelayed and delayed) for 
addressing the initialization problem. 
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3.3. ID-Undelayed initialization 



For the Inverse depth (ID) undelayed method presented in [22], transition from partially to fully 
initialized features do not need to be explicitly tackled; this means that the feature is added to the map 
in its final representation since the first frame was observed. The initialization includes both the 
feature state initial values and the covariance assignment. The initial uncertainty region covers a huge 
range depth [d min , <x>] as Gaussian because the low linearization errors, due to the inverse depth 
parameterization. Once initialized, the feature is processed with the standard EKF 
prediction-update loop. 

Using the inverse depth parameterization, while the feature is observed at low parallax, the feature 
will be used mainly to determine the sensor orientation but the feature depth will be kept quite 
uncertain; if the sensor translation is able to produce a parallax big enough then the feature depth 
estimation will be improved. 

For the ID-Undelayed method (Figure 1), a new feature y new (Equation 5) is initialized, when is 
detected the first time k, as follows: 



(8) 



x. 

1 




X 

V 


y t 




y v 


A 




0 v +z g 



where x v , y v , 8 V are taken directly from the current state x v and zq is the initial bearing measurement. 

The initial value for p, is derived heuristically to cover in its 95% acceptance region, a working space 
from infinity to a predefined close distance d m i n expressed as inverse depth: 



1 



,0 



so: p. = a = ^ 
2 4 



Pn 



1 



(9) 



In [22] the parameters are set as d min = 1, p ; = 0.5, c p = 0.25. The new system state x is conformed 
simply adding the new feature y; to the final of the vector state: 



x = 



X v 

Yi 

y 2 



X — 

new 



X v 

Yi 

y 2 

Yi 



(10) 



The state covariance after feature initialization is defined by: 

P, k) 0 0 

V=J 0 R; 0 



0 0 al 



J' 



(11) 



being J the Jacobian for the initialization function. 
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3. 4. ID-Delayed initialization 



In experiments using the undelayed initialization, it often happens that the inverse depth becomes 
negative after a Kalman update, due to the observation noise that predominates over the update of the 
depth, but there are simple solutions to solve this problem. Moreover, when an initial metric reference 
is used in order to recover/set the scale of the map (very relevant for robotics applications), initial 
fixed parameters (inverse depth and uncertainty) must be tuned in order to ensure convergence. 

Figure 3 illustrates the SLAM process using the ID-Undelayed following a simple straight 
trajectory for 5 features, (4 of them near to the vehicle and the other more distant). In upper plots (a,b 
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and c) the initial parameters (p 0 = 0.01, <r p = 0.005) are set in order to initialize the features at the 
middle of the distance between the vehicle and the more distant feature, and therefore far enough 
respect to the nearby features (plot a). In this case we found in almost every case a huge drift in the 
estimates (plot c), and it can be appreciated that initializing the features far away from the sensor fails 
if there are some closest landmarks. If other values (p 0 = 0.5, <j p = 0.25) are used (near to the sensor), 
(central plots d, e and f) then the percentage of convergence is poor (approx 50%) (plot f), due to the 
influence of the distant feature. In this case if the distant feature is removed from the map (not 
illustrated here) then a percentage of convergence of 80% is achieved. In the last series (lower plots g, 
h and i) we combine both initial values: p 0 = 0.01, a p = 0.005 for the distant landmark and p 0 = 0.5, a p = 
0.25 for the all the nearby ones. In this case an effectiveness of 90% was achieved for the algorithm. 

The issues mentioned above suggest us that the initial inverse depth and their associated initial 
uncertainty of the new features added to the map could be treated before to be added to the system 
state instead of use fixed initial depth and uncertainty. In [23] a delayed version of an undelayed 
method is proposed. In this case, initial depth and uncertainty of each feature are dynamically 
estimated prior to adding the new landmark in the stochastic map. 



For the ID-Delayed method, a new feature y new (Equation 5) is initialized as follows: 
When a feature is detected the first time k, some part of the current state x and covariance matrix P 
together with the sensor measurement are stored, this data X (called candidate points) is composed by: 



The values x\, y\ and 6\ represent the current robot position; o\, a/ and o\ represent their 
associated variances taken from the state covariance matrix Pk and zi is the first bearing measurement 
to the landmark. In subsequent instants k, the feature is tracked until a minimum parallax threshold a m \ n 



Figure 4. ID-Delayed parameterization. 





(12) 
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is reached. Figure 4 shows that a few degrees of parallax are enough to reduce the uncertainty in 
the estimation. 

The parallax a is estimated using: 

(i) The base-line b. 

(ii) Xi, using its associated data (x\, y\, 6\, z\, a x \, a y i > a 9 \). 

(iii) The current state (x v , y v , 6 V , z, a x v , f/y, cf° y ). 

For each candidate point X{, every time that a new bearing measurement z is available, the parallax 
angle a can be estimated as (Figure 4): 

a = n- {(3 + y) (13) 

The angle /? is determined by the directional unitary vector h\ and the vector b\ defines the base-line 
b in the direction of the sensor trajectory. 

The angle y is determined in a similar way as but using the directional unitary vector A 2 and the 
vector bi defining the base line in the opposite direction of the sensor trajectory by: 



(3 = cos' 1 



y = cos 



h 2 ■ b 2 
\\hA\\b, 



(14) 



where (h\ ■ b\) is the dot product between h\ and b\. The directional vector hi, expressed in the 
absolute frame W, points from the sensor location to the direction when the landmark was observed for 
the first time, and is computed using the data stored in Xi denoting the bearing z t . The directional vector 
hi expressed in the absolute frame W is computed in a similar way as h\ but using the current sensor 
position x v and the current measurement z,-. 



cos(6' 1 +zj 

8^(6*! +Zj) 



h 2 = 



cos(^+z,.) 
_sin(^ v + z ; .)_ 



(15) 



b\ is the vector representing the robot base-line between the robot center position x\, y\ stored in X t 
where the point was first detected and the current sensor center (x v , j v ). bi is equal to b\ but pointing to 
the opposite direction. The base-line b is the module of bi or b\\ 



/, = ll /, i| = l /, 2|| \ = [(^-^i)»(^v-^i)] b 2 = [{ x i- x v )>(yi-y v )] 



(16) 



If a > a m m then A, is initialized as a new feature y;. The threshold a m ; n can be established depending 
on the accuracy of the bearing sensor. Depth uncertainty is reasonably well minimized when a = 10°. 

For a new feature y;, values of x,, y t , 0 t are defined in the same way as Equation 8. For the delayed 
approach the dynamical estimation of p, is derived from: 

sin« 



Pi 



6 sin P 



(17) 



The variance a p for the inverse depth p is calculated now from the initialization process, instead of a 
variance predefined heuristically as it was made in the undelayed method, therefore the covariance for 
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the new feature y new is derived from the error diagonal covariance matrix R t measurement and the state 
covariance matrix P. 



For reasons of simplicity R t is defined as a diagonal matrix (cross-covariances are not taken into 
account) and is now conformed by the error variance of the standard deviation of the bearing sensor a z 
(one for each bearing estimation z\ and z) and the variances stored in A,- (o\, a/ and a®). Note that the 
value of c z is constant and is not stored previously in Equation 12. 

The new state covariance matrix, after initialization, is: 



Note that unlike the ID-Undelayed method there is not an implicit initial uncertainty in depth c p 
(Equation 11). In the ID-Delayed method the complete covariance for the new feature is fully 
estimated by the initialization process. 

3.5. Undelayed vs. Delayed 

In an Undelayed approach, when a feature is added to the map the first time that it has been 
observed, its depth is modeled with a huge uncertainty. In that sense, this new feature does not provide 
any depth information. However, at this stage the benefit of the Undelayed approach is that features 
provide information about the sensor orientation from the beginning. 

On the other hand, it can be useful to wait until the sensor movement produces some degrees of 
parallax, (gathering depth information) in order to improve robustness, especially when an initial 
metric reference is used for recovering scale. Moreover, when cameras are used in real cluttered 
environments, the delay can be used for efficiently reject weak features, thus initializing only the best 
features as new landmarks to the map. 

4. Concurrent Initialization 

This section presents a novel and robust method, called Concurrent Initialization, for initializing 
new features in bearing-sensor-based SLAM systems. The method takes advantage of both, undelayed 
and delayed approaches. When a feature is detected for the first time, it is immediately initialized in 
the map (undelayed) as a directional vector which contributes since the beginning to the estimation of 
the sensor orientation. After that, while the sensor moves freely through its environment, the incoming 
measurements are incorporated via an uncorrelated linear Kalman filter in order to estimate the depth 
of the feature. If the sensor movement produces enough parallax, then the feature will be updated with 
the estimated depth, and thus also contributing to the estimation of the sensor location. Very far 
features will not produce parallax, and will remain in the form of a directional vector in the map, but 
contributing to the estimation of sensor orientation. Figure 5 illustrates the concurrent initialization 
process. 




(18) 



new 




(19) 
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4.1. Undelayed stage 

When a feature is detected for the first time k, it is initialized immediately in the map as a new 
landmark yL(i) which is composed by the 3-dimension state vector: 

y L (i) = [*i,y iAY (20) 

where 















A. 




Av+ Z e_ 



x v , y v , 0 V are taken directly from the current state x v and ze is the initial bearing measurement. yL(i) 

defines a directional vector, expressed in the absolute frame W, which represents the direction of the 
landmark from the sensor, when it was observed for the first time. The covariance matrix P is updated 
in the same manner as Equation 19 but using the proper JacobianJ. 

Figure 5. Concurrent initialization process. 




Parallel to the system state (represented by the state vector x), a state vector x can is used for 
estimating (via an extra linear Kalman Filter) the feature depth of each landmark yL- The state x can is 
not directly correlated with the map. 

Every time a new feature y L Q is initialized in x, the state x can is augmented as: 
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X can _ 



A. 



(22) 



where is a 3-dimension vector composed by: 



(23) 



For each Xd, a; is the estimated parallax, Act; is the rate of change in parallax and p, is the estimated 
inverse depth. 

The covariance matrix of x can , P can , is augmented simply by: 



P 0 

can 

0 R„ 



(24) 



The three initial values of h are set to zero, and the initial values of R c have been heuristically 
determined as: R c = diag(.0\, .01, 1). 

4.2. Delayed stage 

While the sensor moves through its environment, it can observe repeatedly a landmark y^, at each 
iteration generating a new angle measurement z. All these new measurements are successively added 
to the linear Kalman Filter (responsible for estimating x can ) in order to infer the landmark depth. For 

each new measurement z,- of a feature yL(i) an iteration of the filter is executed. 
The state transition model for each 1\ is: 



(25) 



A process noise Wk~ N(0,Qk) is considered. In experiments: Qjt = diag(8e 1 , lOe 9 ) have been used. 

The measurement prediction model is directly obtained from the state. On the other hand, the 
measurements z ca „used to update the filter are a function of: (i) the feature yL(i), (ii) the sensor state x v 
and (iii) the current measurement z,-. 





a i(k+\) 




«/(*)+ Aa /(*) 


X can_i(k+1) 










Pi(k+\) 







^can 



: /z($Wx v ,z ( .) 



(26) 



z a and z p are estimated in the same manner as Equation 13 and Equation 17 respectively. Only note 
that in Equation 16, X\ and ji are taken from y L (i), so X\ = %i and ji =yi. 

The implicit uncertainties in the estimation of the function f z are used to compose the error 
measurement covariance matrix R; an : 

^=V/ z (P t )V/ z ' (27) 
where Vf z is the Jacobian of f z with respect to z can . P t is formed by: 
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P. = 



-yL ( i)X v 
0 



yut, 
0 



0 
0 



(28) 



(7_ 



All the components of P t , except a z (the error variance of the bearing sensor) are taken directly from 
the covariance matrix P of the system state x. Pi„ is the submatrix of P corresponding to the covariance 

of the sensor state x v . is the covariance of the feature yL(i). P y and Py L(il i v are the correlations 

between x v and yL®. 

R can is used in the Kalman update equations for estimating the innovation covariance matrix 5*,. 



4.3. Updating depth 



Features expressed in the form of yL(i) are very useful to estimate the sensor orientation 9 V . In [25] a 
visual compass is proposed based in this fact. Besides, the depth of features are needed for estimating 
the sensor location [x v ,_y v ]. For near features y L (i), a small sensor translation is enough to produce some 
parallax and thus to infer depth. 

Figure 6. Parallax and depth estimations for a distant and a close feature. 




steps 

d = 1000 d = 50 



The state x can encloses the parallax a; and inverse depth p, estimations for each feature yL(i> Figure 6 
shows the evolution of parameters a; (upper plot) and d = l/p ; (lower plot) and its uncertainty for the 
feature estimated by the linear Kalman filter, (left plot) feature at a distance of d = 1,000 units and 
(right plot) for a distance of d = 50 units. The boundary uncertainties at 3a are indicated in blue color. 
The filtered values are depicted in red color. Also note in green color the raw measurements z can (taken 
from Equation 26). In these graphics it can be clearly appreciated how the estimation of depth d is 
directly influenced by the parallax; for the near feature, about 100 steps are needed to producing 
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parallax and thus d converges rapidly to its real value. Also note that the uncertainty is rapidly 
minimized. On the other hand, the distant feature produces small parallax. For a low parallax the 
sensor noise predominates and therefore produces very fluctuant raw measurements. Even so, after 
several steps, the filter estimates the depth reasonable well with a high related uncertainty. 

A minimum parallax threshold a m i n is used for updating a feature yip) as y;. Distant features will not 

produce parallax and therefore will remain expressed as yL(i), but contributing to the estimation of 

sensor orientation 6 V . 

On the other hand, if a; > a m i n then: 

y L(I) = k^y^f -> 9i = l^Yi A>pJ 



where /?/ is taken directly from x can . 
corresponding Jacobian: 



The covariance matrix P 



P = J 



0 qol 



J' 



(29) 

is transformed by the 
(30) 



being c p y the variance of the inverse depth estimation for y L (i) and taken from P can . The constant q is 

used to increase the initial uncertainty of p ( in order to improve filter consistency. In experiments q is 
set to 100. 



(31) 
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0 


J = 


0 
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When a feature y L (i) is updated as y;, then its corresponding values will be removed from the linear 
Kalman filter responsible for estimating the state x can . 



4.4. Measurement 



At any time, the map can include both kind of features yL(i) and y;. Thus each kind of feature has its 
own measurement prediction model: 

- For features y; measurement Equation 7 is used. A value of 2 to 6 times the real error of the 
bearing sensor is considered for the measurement process. 

- Features yL(i) are supposed to be very far from the sensor and therefore it is assumed that its 
corresponding bearing measurement z/will remain almost constant. The measurement prediction 
model is simply: 

h m =0 i-°v (32) 

- For near features yL(i), the bearing measurement Zj will change rapidly. Due to this fact the 
standard deviation of the bearing sensor c z for features yL(i), (in update Kalman equations) is 
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multiplied by a high value c, (c = lOe 1 were used in experiments). An interesting issue, to be 
treated for further work, could be the dynamical estimation of parameter c. 

Both kinds of measurement prediction models must be used together, if several measurements z, are 
made at the same time for both kinds of features. For example, consider that the bearing sensor takes 
measurements of the features yip) and y3 simultaneously, then: 



(33) 
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0 0 


dh, 


0 


3x v 




dy 3 





(34) 



where v is the innovation, R is the error covariance matrix and VH is the Jacobian measurement 
model. 



5. Experiments 



Several simulations have been executed in order test the performance of the proposed method in 
relation to other approaches. Simulations are extremely helpful when different methods are compared 
among others, because numerous variables (inherent to real environments) are obviated (e.g., data 
association), or become constants (i.e., synthetic map), therefore the cores of the methods are 
compared. 



5.1. Initialization process of distant and near features 



Figure 7 shows the evolution in the initialization process of two features maps with the concurrent 
initialization: a distant one (600 units of depth) and a near one (50 units of depth). The only 
information given a priori to the system was the scale reference (the three points in yellow) which was 
introduced with an associated uncertainty close to zero in the covariance matrix R. Taking into account 
that there is not an additional sensory input (e.g., odometry), at every step an unknown linear and 
angular acceleration is introduced with zero mean and known-covariance Gaussian processes (Section 
3. 1). In this case, a x = 4 m/s , a y = 4m/s and a 9 = 2 rad/s were used. The only input sensor of the 
system is a noisy sensor capable of measuring the bearing of features, with a limited field of view of 
110° (emulating a 2-DOF camera). A standard deviation of a = 1° is considered for the sensor readings. 

At the begin of the sequence (plot a) both features has been initialized as a directional vector y L (i), 

defining a ray (in cyan color). Note that three feature points (in yellow color) have been previously 
added to the map as a priori known landmarks in order to define/set the scale of the world. Around 
step 100 (plot b), the small displacement of the sensor to the right produces enough parallax to 
estimating the depth of the near feature. Observe that the near feature yL(i) has been transformed to a y; 
feature (blue color). Also note that some uncertainty (especially in depth) remains at the moment of the 
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transformation. By the last step, at 250 iterations, (plot c) the uncertainty in the near feature has been 
full minimized, on the other hand, the movement of the sensor has not produced enough parallax and 
therefore, the distant feature remains in the form of yip) but still contributing to the estimation of the 
sensor orientation. 



Figure 7. Sequence illustrating the concurrent initialization process: (a) beginning of 
initialization; (b) initialization process at step 100; (c) initialization process at step 250. 
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5.2. Comparative study 

In order to show the performance of the Concurrent method proposed in this article, a comparative 
study between the ID-Undelayed method [22] and the ID-Delayed [23] method is presented. 

Figure 8 illustrates the environment setup used in the study. For all the tests, the bearing sensor is 
moved over a semi-cycled U-like shape trajectory, since our main goal is to observe the effect of the 
initialization process of new features in the estimation of both map and sensor location, instead of the 
closing loop problem. About 100 landmarks (in green) simulate the environment of the sensor. Figure 
8 also shows both the map and sensor trajectory estimates after a single run of 2,000 steps of the 
Concurrent method. The features map and their uncertainties are indicated in blue. Note the evolution 
of uncertainty in both sensor and features location. Also note the typical (in SLAM systems) drift in 
both trajectory and map estimations as the sensor moves far away from its initial location. 

The average NEES (normalised estimation error squared [26]) over N Monte Carlo runs of the filter 
was used in order to evaluate the consistency of the methods, as it is proposed in [27]. The NEES is 
estimated as follows: 

=[ x k -x k!k T P k|k[ x i -V] (35) 
where is the true state and the average NEES is computed as: 



Sensors 2010, 10 



1529 



1 N 

** = #5>*(0 (36) 

Four different tests were realized to comparing the methods under diverse conditions. Table 2 
shows the values for the linear and angular acceleration (a w x , a w y and a W e in 4m/s 2 ) and the time 
between "frames" (At in seconds) used for each test. In this case, a higher At implies bigger 
displacements of the robot from frame to frame and therefore more linearization errors due to large 
changes in parallax. 



Table 2. Setup of the tests. 



Test 


w 
a x 


\v 
a y 


w 
a e 


At 


a 


4 


4 


2 


1/30 


b 


4 


4 


2 


1/120 


c 


6 


6 


3 


1/30 


d 


6 


6 


3 


1/120 



In experiments p ini = 0.05 and a p = 0.025 were used for the ID-Undelayed method and a m i n =10° 
were used for both ID-Delayed and Concurrent methods. The NEES was estimated over 
the 3-dimensional robot pose. The average NEES for each method was estimated using N=20 Monte 
Carlo runs. In simulations the sensor was moved 3 meters every At = 1 second for the straight sections 
and 4.5 meters every At = 1 seconds for the curve sections of the trajectory. 

MATLAB code was run using a 1.73 GHz Pentium M laptop. Table 3 shows the execution time of 
the three methods for conditions At = 1/30 and At = 1/120. As it would be expected, for At = 1/120 the 
execution time was around four times longer than At = 1/30, for the same trajectory. Execution time of 
the Concurrent method was estimated using a diagnostic version of the algorithm, marked with an 
asterisk (*) in Table 3. This version uses a 5n Kalman filter (instead of 3n) which estimates two extra 
diagnostic parameters. This means that the real execution time of the Concurrent method should be 
somewhat faster. 



Table 3. Execution time & Failed attempts. 



Method 


At =1/30 


At 

1 it ^ r\ 


a 


b 


c 


d 


ID-Undelayed 


76s 


302s 


2 


l 


0 


2 


ID-Delayed 


60s 


235s 


9 


2 


ll 


4 


Concurrent 


94s* 


376s* 


0 


0 


0 


0 



For some runs of the algorithms, filter divergence could occur. Table 3 also shows the failed 
attempts, this means the number of times that filter divergence appeared before a method reach N = 20 
positives runs (with convergence), for each test. The average NEES was only estimated using 
positive runs. 
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Figure 8. SLAM simulation using the concurrent initialization method. 
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Figure 9 shows the evolution of the average NEES (sk) for tests (a), (b), (c), (d) for each method. 

Note that test (b) and (d) need four times more steps than test (a) and (c) in order to complete the same 
trajectory due to their particular value of At. 

As it would be expected, the whole methods become optimistic after a certain time [27]. 
Nevertheless, it can be observed for tests (a), (b) and (c) that Wk achieved for Concurrent method (red) 

tends to be lower than e achieved for both ID-Undelayed (blue) and ID-Delayed (green) methods. 

This behavior is more evident in test (a) and (c), where lower noise is injected into the sensor motion 
model. The difference between (a) and (c) is the time between "frames" (At) used for each test. The 
above results suggest, that if adequate noise is injected, then Concurrent method seems to be less 
sensitive to large jumps in parallax (and linearization errors), and therefore could be suitable for 
application where a high "frame-rate" is not available. Regarding to the ID-Undelayed method, it can 
be observed the effect of the parameter At over the magnitude of Sk, although its form is somewhat 
similar for all tests. 

In test (a) and (c) where At is higher (1/30), st reaches around 600 units, while in test (b) and (d) where 

At is lower (1/120) Sk reaches around 200 units. Theoretically the Sk for the ID-Undelayed method 

would be more favorable as At—*0. At this point, it is important to remember that implementation of 
Concurrent method implies the estimation of an extra linear Kalman filter of dimension 3n, where n is 
the number of features yu§ in the state x. On the other hand, the Concurrent method is lees sensitive to 
the parameter At. At least for this experimental setup, the average NEES for the Concurrent method, 
when At = 1/30 (94 s of execution time), is even lower than the Sk for the ID-Undelayed, when 
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= 1/120 (302 s of execution time). The ID-Delayed method represents the most efficient alternative 
in computational cost with medium performance in average NEES terms. However the ID-Delayed 
method shows to be the least robust method in terms of convergence (Table 3). The problem of 
convergence for the ID-Delayed method is notorious in tests (a) and (c) where At is higher (1/30), 
these results indicate a frame-rate dependency of the method. On the other hand, for the realized tests, 
the Concurrent method shows the best results in terms of convergence. 

Figure 9. Comparison of the average NEES for ID-Undelayed, ID-Delayed and 
Concurrent methods. 
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6. Conclusions 



This work proposes a novel and robust approach for initializing new features in SLAM systems 
based in bearing sensors. First, an overview of the problem is given and the most relevant related work 
is presented. Most of the methods presented in the literature can be classified into two categories: 
Undelayed and Delayed Methods. In that sense, an analysis of two representative methods of the 
aforementioned taxonomy is also presented. Undelayed methods provide information of orientation 
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since a feature is first detected, on the other hand, Delayed methods await until some depth 
information is gathered, improving convergence. 

The proposed approach in this article, the concurrent initialization method, takes the best of both, 
Undelayed and Delayed approaches. The key is to use two kinds of feature representations 
concurrently for both undelayed and delayed stages of the estimation. The simulations results, based in 
the average NEES test, showed that the Concurrent method can maintain the filter consistency 
satisfactorily. Moreover, observing the percentage of convergence, the Concurrent method appears 
also to be robust. 

It is important to mention that the complexity of the Concurrent method is higher than other 
methods {i.e., ID-Undelayed method) and the additional Kalman filter certainly implies an increase of 
computational requirements per frame. On the other hand, the Concurrent method appears to be less 
sensitive to the linearization errors induced for large jumps in parallax (much time between frames). In 
that sense the Concurrent method could be even more efficient in computational terms than other 
methods because it seems to work properly at low frame rate. This attribute also makes it suitable for 
applications where a high frame rate is not available for different reasons. The concurrent initialization 
method could be a robust alternative to bearing sensor based SLAM systems. 
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